Training and delayed reinforcements in Q-learning agents

Authors

  • Pierguido V. C. Caironi
  • Marco Dorigo
Abstract

Q-learning can converge much faster when it is helped by immediate reinforcements from a trainer able to judge the usefulness of actions as stage setting with respect to the goal of the agent. This paper investigates this hypothesis experimentally by studying how immediate reinforcements (also called training reinforcements) can be integrated with standard delayed reinforcements (namely, reinforcements assigned only when the agent-environment relationship reaches a particular state, such as when the agent reaches a target). The paper proposes two new algorithms (TL and MTL) able to exploit even locally wrong and misleading training reinforcements. The proposed algorithms are tested against Q-learning and against other algorithms described in the literature (AB-LEC and BB-LEC) that also make use of training reinforcements. Experiments are run in a grid world where a Q-agent, a simple simulated robot, must learn to reach a target. Accepted for publication in International Journal of Intelligent Systems, 1997. In press.
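The idea of combining the two kinds of reinforcement can be illustrated with a minimal sketch. It is not the paper's TL or MTL algorithm, which the abstract does not specify; it is plain tabular Q-learning in which a hypothetical trainer signal is simply added to the delayed reward. The grid size, the target cell, the reward magnitudes, the learning parameters, and the names `step`, `trainer_reward`, and `train` are all assumptions made for illustration.

```python
import random

SIZE = 5                      # assumed grid: SIZE x SIZE cells
GOAL = (4, 4)                 # assumed target cell
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def step(state, action):
    """Move one cell, clipping at the grid walls."""
    x = min(max(state[0] + action[0], 0), SIZE - 1)
    y = min(max(state[1] + action[1], 0), SIZE - 1)
    return (x, y)


def manhattan(state):
    """Manhattan distance from a cell to the target."""
    return abs(GOAL[0] - state[0]) + abs(GOAL[1] - state[1])


def trainer_reward(state, next_state):
    """Hypothetical trainer: approve moves that get closer to the target."""
    return 0.1 if manhattan(next_state) < manhattan(state) else -0.1


def train(episodes=200, use_trainer=True, alpha=0.5, gamma=0.9,
          epsilon=0.1, seed=0, max_steps=200):
    """Tabular Q-learning; returns the Q-table and steps used per episode."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(SIZE) for y in range(SIZE) for a in range(n_actions)}
    steps_per_episode = []
    for _ in range(episodes):
        state = (0, 0)
        for t in range(1, max_steps + 1):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[(state, i)])
            nxt = step(state, ACTIONS[a])
            # Delayed reinforcement: given only when the target is reached.
            reward = 1.0 if nxt == GOAL else 0.0
            # Immediate (training) reinforcement: given on every step.
            if use_trainer:
                reward += trainer_reward(state, nxt)
            best_next = 0.0 if nxt == GOAL else max(
                q[(nxt, i)] for i in range(n_actions))
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
            if state == GOAL:
                break
        steps_per_episode.append(t)
    return q, steps_per_episode
```

Calling `train(use_trainer=True)` and `train(use_trainer=False)` with the same seed gives two step-count curves that can be compared. The sketch makes no claim about which one converges faster in general, and it does nothing to guard against a misleading trainer, which is the problem TL and MTL are said to address.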


Similar resources

Multigrid Q-learning

Reinforcement learning scales poorly when reinforcements are delayed. The problem of propagating information from delayed reinforcements to the states and actions that have an effect on the reinforcement is similar to the problem of propagating information in a discretized boundary value problem. Multigrid methods have been shown to decrease the number of updates required to solve boundary value pr...


Genetic Encoding of Agent Behavioral Strategy

The general framework tackled in this paper is the automatic generation of intelligent collective behaviors using genetic programming and reinforcement learning. We define a behavior-based system relying on automatic design process using artificial evolution to synthesize high level behaviors for autonomous agents. Behavioral strategies are described by tree-based structures, and manipulated by...


Twelfth National Conference on Artificial Intelligence (AAAI-94). Incorporating Advice into Agents that Learn from Reinforcements

Richard Maclin and Jude W. Shavlik, Computer Sciences Dept., University of Wisconsin, 1210 West Dayton Street, Madison, WI 53706. Abstract: Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training epis...


Incorporating Advice into Agents that Learn from Reinforcements

Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training episodes. We present an approach that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, at any time and in a natural manner, by an external observer. In our approach, the advice-giver watches the ...


The Role of the Trainer in Reinforcement Learning

In this paper we propose a three-stage incremental approach to the development of autonomous agents. We discuss some issues about the characteristics which differentiate reinforcement programs (RPs), and define the trainer as a particular kind of RP. We present a set of results obtained running experiments with a trainer which provides guidance to the AutonoMouse, our mouse-sized autonomous rob...



Journal:
  • Int. J. Intell. Syst.

Volume 12

Publication date: 1997